† Corresponding author. E-mail:
The emergence of Event-based Social Network (EBSN) data, which contain both social and event information, has cleared the way to study the relationship between virtual interactions and physical interactions. In existing studies, it is not clear which factors affect event similarity between online friends or how strongly each factor matters. In this study, a multi-layer network based on the Plancast service data is constructed. The users’ event belongingness is shuffled by constructing two null models to detect offline event similarity between online friends. The results indicate that there is a strong correlation between online social proximity and offline event similarity. The micro-scale structures of the Plancast online social network are also preserved at multiple levels by constructing 0k–3k null models to study how the micro-scale characteristics of the online network affect the similarity of offline events. It is found that the assortativity pattern is a significant micro-scale characteristic for maintaining offline event similarity. Finally, we study how the structural diversity of online friends affects offline event similarity. We find that the subgraph structure of common friends has no positive impact on event similarity while the number of common friends plays a key role, which differs from other studies. In addition, we discuss the randomness of different null models, which can measure the degree of information availability in privacy protection. Our study not only uncovers the factors that affect offline event similarity between friends but also presents a framework for understanding the pattern of human mobility.
Online social network services have become the fastest growing applications on the Internet. On these virtual networks, users communicate with each other and share all kinds of information. Newly emerged data sources, such as Event-based Social Network (EBSN) data[1] and Location-based Social Network (LBSN) data,[2] contain not only online virtual interactions, as in traditional social networks, but also offline physical interactions of users, which makes it possible to combine virtual interactions with physical social ones. How to exploit this interactive relationship to provide users with more convenience has been a research hotspot in recent years. As of March 2017, Plancast had more than 50 million registered users and published an average of 104 events a day. An event is a special case that is caused by many reasons and conditions, occurs at a certain time and place, and may be accompanied by certain inevitable results. Understanding event similarity has direct potential applications in recommending all kinds of services,[3] building smart cities,[4] and relieving traffic pressure.[5] Therefore, the relevant studies of social networks and their applications are of great significance.[6–9]
The traditional calculation of event similarity is based on the six-section definition of an event, which only involves the action, object, environment, assertion, language expression, and time factor of the event element itself.[10] However, this neglects the influence of other external factors on event similarity. Therefore, the results are not comparable between different studies. In this work, we construct null models to uncover the factors that influence the offline event similarity between friends.[11,12] Null networks provide an accurate reference for the original network and, combined with statistical indicators, can describe its non-trivial characteristics, which helps us to reveal the origin and the level of its complexity. The term ‘null model’ was proposed by Colwell et al. in a conference. Generally, a randomized network that shares some of the properties of the real-life network is called a null network of the original network.[13] Null networks have been widely applied in analyzing clustering coefficient,[14,15] degree distribution,[16] link prediction,[17,18] and community detection.[19,20] In this study, all null networks are generated by a random rewiring algorithm, which randomly shuffles the edges of the original network and makes it as random as possible under the given constraints.[21,22]
EBSNs contain not only the online social relationships of friends but also information on their offline events in actual life. Plancast is a typical event-based network and the dataset is big enough for our purposes. The online social network contains 76665 nodes and 1702058 edges, and the offline network contains 401634 events. In existing studies of the Plancast network, it is not clear which factors affect event similarity between friends or how strongly each factor matters. For example, we do not know whether there is a relationship between online friends and their offline events and, if so, how the online social network affects offline event similarity.[23] Furthermore, the influence of the number and subgraph structure of common neighbors on offline event similarity is not clear.[24,25]
In this study, we constructed a double-layer network from the Plancast data and measured offline event similarity by the Hub Promoted Index (HPI).[26,27] The offline event similarity in Plancast was first detected by constructing two null models that changed the belongingness of events and by calculating the HPI of each null model. We found that the events of friends showed a strong similarity; that is, once we changed the belongingness of events, the event similarity of friends was greatly reduced, which indicated a strong correlation between online social proximity and offline event similarity. Then, the micro-scale structures at multiple levels were maintained by constructing 0k–3k null networks to study how these characteristics (average degree, degree distribution, assortativity, and clustering coefficient) affected offline event similarity. We found that average degree and degree distribution (0k and 1k characteristics) were not enough to maintain offline event similarity, but assortativity (2k characteristic) was a significant characteristic for maintaining offline event similarity. Finally, by calculating the HPI for different numbers and structures of common friends, we found that the subgraph structure of common friends had no positive impact on event similarity while the number of common friends played a key role, which was different from the results reported in [25]. In addition, we discussed the randomness of different null models, which can measure the degree of information availability in privacy protection.
Recently, event-based online social services, such as Plancast and Eventbrite, have experienced rapid growth. From these services, researchers observe a new type of social network, which makes it possible to combine virtual and physical social interactions. Figure
In this study a multi-layer network is constructed by utilizing the Plancast data, and the constructing principle is shown in Fig.
If two users are involved in more events together, then they will be considered to be more similar. In recent years, researchers have developed a number of indexes to measure similarity, such as Jaccard index, Hub Depressed Index (HDI), and Hub Promoted Index (HPI).[26,27] These are common indexes in analyzing user similarity. For a node x, let kx represent the degree of x and
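As an illustration, the three indexes named above can be sketched in Python using their standard definitions from the link-prediction literature: the overlap of two sets divided by the size of their union (Jaccard), by the larger set size (HDI), or by the smaller set size (HPI). The sketch assumes that each user is represented by the set of events they attended, so that the "degree" of a user is the size of that set; the function names and the toy data are hypothetical and are not taken from the Plancast study.

```python
def jaccard(a: set, b: set) -> float:
    """Shared events divided by the size of the union of both event sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hdi(a: set, b: set) -> float:
    """Hub Depressed Index: shared events divided by the larger set size."""
    denom = max(len(a), len(b))
    return len(a & b) / denom if denom else 0.0


def hpi(a: set, b: set) -> float:
    """Hub Promoted Index: shared events divided by the smaller set size."""
    denom = min(len(a), len(b))
    return len(a & b) / denom if denom else 0.0


# Hypothetical event sets for two users who are online friends.
events_x = {"e1", "e2", "e3"}
events_y = {"e2", "e3", "e4", "e5"}

for name, index in (("Jaccard", jaccard), ("HDI", hdi), ("HPI", hpi)):
    print(name, index(events_x, events_y))
```

Because HPI divides by the smaller of the two set sizes, it is never smaller than HDI or Jaccard for the same pair of users.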
Like conventional online social networks, EBSNs provide an online virtual world where users exchange thoughts and share experiences. What distinguishes EBSNs from conventional social networks is that EBSNs also capture the face-to-face social interactions that arise when users attend events in the offline physical world. To study the relationship between online social proximity and offline event similarity, we introduce two kinds of null models, which change the offline event chains while preserving the social relationships in the online social network.
The construction of the null model of exchanging trajectory is shown in Fig.
The construction of the null model of exchanging user events is shown in Fig.
The null model of exchanging trajectory does not change the order of the event sequence but shuffles the ownership of the whole trajectory for each user. The null model of exchanging user events not only changes the ownership of a whole trajectory but also shuffles the set of events belonging to each user. Both null models change the offline event characteristics and can therefore detect the correlation between offline event similarity and online social proximity. This correlation plays an important role in coordinating online and offline resources, such as recommending offline events based on online relationships.
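The two shuffling schemes can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: each user's trajectory is taken to be an ordered list of event identifiers, the exchange of user events is assumed to keep each user's number of events fixed, and the function names `exchange_trajectories` and `exchange_user_events` are hypothetical.

```python
import random


def exchange_trajectories(trajectories: dict, rng: random.Random) -> dict:
    """Reassign whole event sequences to randomly chosen users.

    The order of events inside each sequence is left untouched.
    """
    users = list(trajectories)
    sequences = [trajectories[u] for u in users]
    rng.shuffle(sequences)
    return dict(zip(users, sequences))


def exchange_user_events(trajectories: dict, rng: random.Random) -> dict:
    """Shuffle individual events across users.

    Assumption: every user keeps the same number of events as before.
    A user may receive the same event twice if it occurs more than once
    in the pooled list; a stricter variant would reject such draws.
    """
    users = list(trajectories)
    pool = [event for u in users for event in trajectories[u]]
    rng.shuffle(pool)
    shuffled, start = {}, 0
    for u in users:
        count = len(trajectories[u])
        shuffled[u] = pool[start:start + count]
        start += count
    return shuffled


# Hypothetical toy data: three users and their event sequences.
toy = {"u1": ["a", "b"], "u2": ["c"], "u3": ["d", "e", "f"]}
rng = random.Random(0)
print(exchange_trajectories(toy, rng))
print(exchange_user_events(toy, rng))
```

In both cases the online friendship layer is not touched, so any drop in event similarity between friends after shuffling can be attributed to the changed event ownership.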
In this study, the above-mentioned indexes (Jaccard, HDI, and HPI) have all been applied and the results are similar. The differences are most clearly visible with HPI, so event similarity is reported using HPI in this study. The HPI distributions of the original network and its null models of exchanging trajectory are shown in Fig.
Traditional methods study event similarity simply by calculating the similarity between the factors of the event itself. However, these methods cannot meet the research requirements of actual complex systems because event similarity is often affected by other factors, such as the micro-scale structures of online social networks. The 0k–3k null networks maintain different micro-scale characteristics (such as average degree, degree distribution, assortativity, and clustering coefficient) of the original network. The null networks of different orders are interrelated, that is,
The 0k null network is the simplest and most randomized null model, which only possesses the same number of nodes and the average degree as a given graph
The 1k null network possesses the same degree distribution (or sequence) as the original network. The degree distribution refers to the probability that a randomly chosen node of the original network has a given degree. If
The 2k null network possesses the same joint degree distribution as the original network. The joint degree distribution refers to the number (or probability) of edges whose two end nodes have a given pair of degree values. Suppose that k represents the degree of a node and that
The 3k null network possesses the same joint edge-degree distribution
While constructing the 3k null networks, links in the offline network remain unchanged while those in the online network are shuffled. Two edges are randomly selected in the original network such that the joint degree distribution remains the same; that is, the basic properties of the 2k characteristics remain the same. The numbers of open and closed triangles for each node at both ends of the two links and for its neighbor nodes are then calculated before and after shuffling. If the numbers are the same, then we can conclude that the shuffling is successful. For example, referring to the constraints of generating the 3k null network, we disconnect two edges A–B and C–D. If the pairs of nodes A–D and B–C are not connected, then we connect them. The result of the reconnection is shown in Fig.
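The reconnection step described above (disconnect A–B and C–D, then connect A–D and B–C if those pairs are not yet linked) can be sketched as a double-edge swap. The code below is an illustrative sketch, not the authors' implementation: it assumes an undirected graph stored as a dictionary of neighbor sets with integer node labels, and the names `try_swap`, `joint_degrees`, and `keep_2k` are hypothetical. Every successful swap preserves each node's degree (the 1k constraint); with `keep_2k=True` a swap is accepted only if it also leaves the multiset of end-node degree pairs unchanged (the 2k constraint). The additional triangle check required for 3k null networks is deliberately omitted.

```python
import random


def joint_degrees(adj: dict) -> list:
    """Sorted list of (degree, degree) pairs, one per undirected edge."""
    return sorted(
        tuple(sorted((len(adj[u]), len(adj[v]))))
        for u in adj for v in adj[u] if u < v
    )


def try_swap(adj: dict, rng: random.Random, keep_2k: bool = False) -> bool:
    """Attempt one swap A-B, C-D -> A-D, C-B; return True if it was applied."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    (a, b), (c, d) = rng.sample(edges, 2)
    if rng.random() < 0.5:  # pick the orientation of the second edge at random
        c, d = d, c
    if len({a, b, c, d}) < 4:  # shared end node would create a self-loop
        return False
    if d in adj[a] or b in adj[c]:  # new edge already exists
        return False
    # 2k constraint: degree pairs stay the same if deg(B) == deg(D)
    # or deg(A) == deg(C); otherwise the swap is rejected.
    if keep_2k and len(adj[b]) != len(adj[d]) and len(adj[a]) != len(adj[c]):
        return False
    adj[a].remove(b); adj[b].remove(a)
    adj[c].remove(d); adj[d].remove(c)
    adj[a].add(d); adj[d].add(a)
    adj[c].add(b); adj[b].add(c)
    return True


# Hypothetical toy graph: a ring of 8 nodes with two chords.
g = {i: set() for i in range(8)}
for i in range(8):
    j = (i + 1) % 8
    g[i].add(j); g[j].add(i)
for u, v in [(0, 4), (1, 5)]:
    g[u].add(v); g[v].add(u)

degrees_before = sorted(len(nb) for nb in g.values())
jdd_before = joint_degrees(g)
rng = random.Random(42)
swaps = sum(try_swap(g, rng, keep_2k=True) for _ in range(200))
print("accepted swaps:", swaps)
```

Rebuilding the edge list on every attempt keeps the sketch short but costs time proportional to the number of edges; for a network of the size reported here, one would maintain the edge list incrementally instead.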
The HPI distributions of the Plancast network and its 0k–3k null networks are shown in Fig.
We find that the HPI values of the 0k and 1k null networks are lower than those of the original network. Compared with the 0k and 1k null networks, the HPI distribution of the 2k null networks is very close to that of the original network because they preserve more of its micro-scale structure (i.e., assortativity). Furthermore, the HPI distribution of the 3k null networks is even closer to that of the original network because 3k null networks also keep its clustering characteristic. These results suggest that the average degree (0k characteristic) and degree distribution (1k characteristic) are not enough to maintain event similarity between friends, but assortativity (2k characteristic) and clustering coefficient (3k characteristic) are significant micro-scale structures for maintaining event similarity between friends.
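To check which of these characteristics a given null network actually preserves, the assortativity and clustering of the original and shuffled networks can be compared directly. The following is a minimal sketch using common textbook definitions (the Pearson correlation of the degrees at the two ends of each edge, and the mean local clustering coefficient); it assumes an undirected graph stored as a dictionary of neighbor sets, and the function names are hypothetical rather than taken from the study.

```python
def degree_assortativity(adj: dict) -> float:
    """Pearson correlation between the degrees at the two ends of each edge.

    Every undirected edge is counted in both directions, so both ends
    share the same mean and variance.
    """
    xs, ys = [], []
    for u in adj:
        for v in adj[u]:
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    if not xs:
        return float("nan")
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return cov / var if var else float("nan")


def average_clustering(adj: dict) -> float:
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for u, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue
        # Each link between two neighbors of u is seen twice in this sum.
        links = sum(len(adj[v] & neighbors) for v in neighbors) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


# Hypothetical toy graphs: a star (hub 0 with three leaves) and a triangle.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(degree_assortativity(star), average_clustering(star))
print(average_clustering(triangle))
```

A star is maximally disassortative and contains no triangles, while a triangle is fully clustered, which makes the two toy graphs convenient sanity checks before applying the functions to a large network.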
Studies on network structure characteristics can help us to reveal the nature of complex systems and the relationship between network structure and function. Therefore, more researchers are devoting themselves to studying the diversity of network structures.[25,31] The number and subgraph structure of common neighbors are important indexes that reflect the structural characteristics of social networks.[24] However, existing studies report different effects of these two factors on real networks. For example, the researchers in [31] have shown that the number of common neighbors has a greater impact on the similarity of network structure. In contrast, it is found in [25] that the number of common neighbors has no influence on social networks while the subgraph structure of common neighbors has a positive influence. In this section, we study the influence of these two factors on offline event similarity between friends.
Common Neighbors (CN) index is a common indicator to explore the structural similarity of two users in social networks. The definition of CN is
The other index that can explore the structural diversity is the subgraph structure of common neighbors. We measure this index by the Number of small Connected Components (NCC) in the induced subgraph of common neighbors. For example, in Fig.
We calculate the number of common neighbors in the original network and its corresponding 1k–3k null networks respectively to study event similarity between friends, and the corresponding HPI results are shown in Fig.
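Both structural indexes can be computed from the online friendship layer alone. The sketch below is illustrative and rests on stated assumptions: the graph is a dictionary of neighbor sets, CN is taken as the set of nodes adjacent to both users, and NCC is taken as the number of connected components of the subgraph induced by those common neighbors (without any size threshold on the components). The function names are hypothetical.

```python
def common_neighbors(adj: dict, x, y) -> set:
    """Nodes that are linked to both x and y."""
    return adj[x] & adj[y]


def ncc(adj: dict, x, y) -> int:
    """Number of connected components among the common neighbors of x and y.

    Only links between common neighbors are followed, so x and y
    themselves never join two components together.
    """
    remaining = set(common_neighbors(adj, x, y))
    components = 0
    while remaining:
        stack = [remaining.pop()]
        components += 1
        while stack:
            node = stack.pop()
            linked = adj[node] & remaining
            remaining -= linked
            stack.extend(linked)
    return components


# Hypothetical toy graph: users 0 and 1 share the friends 2, 3 and 4;
# friends 2 and 3 know each other, friend 4 knows neither of them.
friends = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3, 4},
    2: {0, 1, 3},
    3: {0, 1, 2},
    4: {0, 1},
}
print(len(common_neighbors(friends, 0, 1)), ncc(friends, 0, 1))
```

Grouping friend pairs by these two values and averaging the HPI within each group separates the effect of the number of common friends from the effect of how those friends are connected to each other.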
The results in Fig.
In Fig.
Figure
Event data bring not only huge benefits to people but also the risk of personal information leakage.[33] This happens because event data not only directly contain private user information but also imply personal habits, health status, social status, and other sensitive information about the users. Once event data are improperly used, all aspects of users’ privacy will be seriously threatened. The randomness of different null networks corresponds to the degree of information available for privacy protection. We calculate the HPI values of six null models, and the results are shown in Fig.
In summary, we use Plancast data to build a double-layer network. By constructing null models of the EBSN network, we study the factors that affect event similarity between friends and the influence level of each factor. First, two null models, exchanging trajectory and exchanging user events, are proposed to change users’ event belongingness and thereby detect offline event similarity between friends. Our experimental results show that there is a strong correlation between online social proximity and offline event similarity. To be specific, the events of friends show a similarity; that is, once we change the belongingness of events, the event similarity of friends is greatly reduced. This correlation can help us to make offline event recommendations based on online social interactions.
Second, the influence of micro-scale network structures on the event similarity between friends is studied by constructing 0k–3k null networks of the original network. Our results indicate that the average degree and degree distribution (0k and 1k characteristics) are not enough to maintain offline event similarity between friends, but assortativity (2k characteristic) and higher-order features (such as clustering coefficient) are significant micro-scale characteristics for maintaining offline event similarity between friends. This is helpful for researchers who wish to understand the pattern of human mobility from online network topology.
Finally, we study the influence of the structural diversity of online friends on offline event similarity. Our experimental results show that the structural diversity of common neighbors has no positive impact on event similarity, while the number of common friends plays a key role. In addition, the randomness of different null models, which can measure the degree of information availability in privacy protection, is discussed. Our study not only uncovers the factors that affect offline event similarity between friends but also presents a framework for understanding the pattern of human mobility in cities.
[1] | |
[2] | |
[3] | |
[4] | |
[5] | |
[6] | |
[7] | |
[8] | |
[9] | |
[10] | |
[11] | |
[12] | |
[13] | |
[14] | |
[15] | |
[16] | |
[17] | |
[18] | |
[19] | |
[20] | |
[21] | |
[22] | |
[23] | |
[24] | |
[25] | |
[26] | |
[27] | |
[28] | |
[29] | |
[30] | |
[31] | |
[32] | |
[33] |